Acoustic Analysis of Whispered Speech for Phoneme and Speaker Dependency
Authors
Abstract
Speakers whisper in certain circumstances to protect personal information. Because neutral and whispered speech are produced by different mechanisms, their spectral structures differ considerably, with effects such as formant shifts and changes in spectral slope. This study analyzes how these differences depend on speaker and phoneme by applying a Vector Taylor Series (VTS) approximation to a model of the transformation from neutral to whispered speech and estimating the model parameters with an Expectation-Maximization (EM) algorithm. The results shed light on the speaker and phoneme dependency of the neutral-to-whisper shifts and suggest that similarly derived model adaptation or compensation schemes for whispered speech recognition and speaker recognition will be highly speaker dependent.
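The abstract does not give the transformation model or the EM update equations, so the following is only a minimal sketch of the generic first-order VTS step that such an approach relies on: a nonlinear transform g is linearized around a Gaussian mean, so that N(mu, Sigma) maps approximately to N(g(mu), J Sigma J^T), where J is the Jacobian of g at mu. The function names `vts_first_order` and `toy_transform`, the numerical Jacobian, and the toy transform itself are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def vts_first_order(g, mu, sigma, eps=1e-6):
    """Propagate N(mu, sigma) through g with a first-order Taylor expansion at mu.

    Returns the approximate mean g(mu) and covariance J @ sigma @ J.T,
    where J is the Jacobian of g at mu (estimated by central differences).
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    g_mu = np.asarray(g(mu), dtype=float)

    # Numerical Jacobian: one column per input dimension.
    jac = np.empty((g_mu.size, mu.size))
    for j in range(mu.size):
        step = np.zeros_like(mu)
        step[j] = eps
        upper = np.asarray(g(mu + step), dtype=float)
        lower = np.asarray(g(mu - step), dtype=float)
        jac[:, j] = (upper - lower) / (2.0 * eps)

    return g_mu, jac @ sigma @ jac.T


def toy_transform(x, offset=np.array([0.5, -0.2])):
    """Hypothetical nonlinear transform, NOT the paper's model.

    It has the log-domain form x + log(1 + exp(offset - x)) that is
    commonly linearized with VTS in noise-robust speech recognition.
    """
    return x + np.log1p(np.exp(offset - x))


if __name__ == "__main__":
    neutral_mean = np.array([1.0, 0.3])   # assumed example values
    neutral_cov = np.diag([0.2, 0.1])     # assumed example values
    mean, cov = vts_first_order(toy_transform, neutral_mean, neutral_cov)
    print("approximate transformed mean:", mean)
    print("approximate transformed covariance:\n", cov)
```

In an EM setting, a linearization of this kind is typically recomputed at each iteration around the current parameter estimates; the specific parameterization used in the study is not stated here.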
Similar resources
Acoustic analysis and feature transformation from neutral to whisper for speaker identification within whispered speech audio streams
Whispered speech is an alternative speech production mode to neutral speech, which talkers use intentionally in natural conversational scenarios to protect privacy and to prevent certain content from being overheard or made public. Due to the profound differences between whispered and neutral speech in vocal excitation and vocal tract function, the performance of automatic speaker identi...
Allophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighboring phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...
Speaker identification for whispered speech using modified temporal patterns and MFCCs
Speech production variability due to whisper represents a major challenge for effective speech systems. Whisper is used by talkers intentionally in certain circumstances to protect personal privacy. Due to the absence of periodic excitation in the production of whisper, there are considerable differences between neutral and whispered speech in the spectral structure. Therefore, performance of ...
Perception and production of boundary tones in whispered Dutch
The main cue to interrogativity in Dutch declarative questions is found in the final boundary tone. When whispering, a speaker does not produce the most important acoustic information conveying this: the fundamental frequency. In this paper listeners are shown to perceive the difference between whispered declarative questions and statements, though less clearly than in phonated speech. Moreover...
Compensating for speaker or lexical variabilities in speech for emotion recognition
Affect recognition is a crucial requirement for future human-machine interfaces to respond effectively to nonverbal behaviors of the user. Speech emotion recognition systems analyze acoustic features to deduce the speaker’s emotional state. However, the human voice conveys a mixture of information including speaker, lexical, cultural, physiological and emotional traits. The presence of these commun...
Publication date: 2011